ABSTRACT


The  primary  subject  of  discussion  in  this  report  is  the  conceptual  aspects  of  CHILD 
(Cognitive  Hybrid  Intelligent  Learning  Device).  CHILD  is  a  self-adaptive  learning 
machine  which  was  conceived,  designed,  and  constructed  at  the  Information  Processing 
Lab,  Rome  Air  Development  Center. 

An attempt is made to describe learning machines in a functional sense in order to
isolate the unique properties of this concept. To do this, learning machines must be
placed on some common ground. Therefore, adaptive learning devices will be viewed as
networks of redundant adaptive elements which are capable of being organized by some
"learning" logic. The common function performed by the learning machines under
consideration here consists basically of a remapping of the sensory space in some manner
which will enable decision elements to divide the remapped sensory inputs into various
classes. The primary adaptive function of such machines is (or should be) the
determination of the transformation(s) required in order to successfully solve the given problem.
With this common basis for comparison, the unique properties of CHILD as a new concept
in artificial intelligence should become apparent.
A NEW CONCEPT IN ARTIFICIAL INTELLIGENCE




The object of this report is to describe the conceptual aspects of CHILD (Cognitive
Hybrid Intelligent Learning Device) in such a manner as to permit the reader to perform
a valid comparison with other learning machine concepts which intend to perform a
similar function.


In  order  to  discuss  learning  machines,  they  must  be  placed  on  some  common  ground. 
The function performed by the learning machines under consideration here consists
basically of a remapping of the sensory space in some manner which will enable decision
elements to divide the remapped sensory inputs into various classes. The primary
adaptive portion of such machines is (or should be) the determination of the transformation(s)
required  in  order  to  successfully  solve  the  given  problem.  A  further  function  of  these 
machines  is  to  then  subdivide  the  new  sensory  space  such  that  the  decision  elements  can 
act  on  the  remapped  inputs  in  order  to  perform  the  classification  function. 

The input to CHILD consists of n analog values, which can be thought of as an
n-dimensional analog vector. This input vector can be derived either directly from the
sensors,  or  from  some  preprocessing  technique  utilized  to  extract  characteristics  (or 
features)  from  the  sensory  pattern.  The  fact  that  many  such  extraction  techniques  yield 
analog  values  constitutes  the  primary  reason  for  the  choice  of  analog  inputs  for  CHILD. 

It is CHILD's primary purpose, then, to determine (1) which components of the input
vector  are  important,  (2)  the  range  of  acceptable  values  each  component  may  assume,  and 
(3)  the  degree  of  importance  to  be  assigned  to  each  component. 

The  basic  element  in  CHILD  has  a  transfer  function  that  causes  an  output  only  if  the 
input  satisfies  certain  criteria,  i.e.,  falls  between  two  stored  values.  The  output  thus 
caused  consists  of  a  third  stored  value.  The  transfer  function  of  a  CHILD  cell,  then,  is 
shown in Figure 1. The circuitry following each CHILD cell sees $w_{ij}$ when the input
stimulus falls between $\theta_{ijl}$ and $\theta_{iju}$ (the lower and upper stored values of cell j
in row i). Connecting n of these cells in parallel to form a row of cells, and arranging
rows of cells in a parallel array, results in a functional diagram as shown in Figure 2.
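
The cell behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit itself: the function name `cell_output` and its parameter names are hypothetical, and the strict inequality at the window edges is an assumption, since the text does not say what happens when the input equals a stored value.

```python
def cell_output(s, theta_lower, theta_upper, w):
    """Model of one CHILD cell (names are illustrative assumptions).

    Returns the stored weight w when the analog input s lies strictly
    between the two stored values, and 0.0 (no output) otherwise.
    """
    return w if theta_lower < s < theta_upper else 0.0


# Hypothetical values: a cell accepting inputs between 0.2 and 0.8.
inside = cell_output(0.5, 0.2, 0.8, 1.5)   # input falls inside the window
outside = cell_output(0.9, 0.2, 0.8, 1.5)  # input falls outside the window
print(inside, outside)
```

The three stored values per cell correspond to the three quantities CHILD must determine for each input component: the acceptable range (two values) and the degree of importance (the weight).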

Figure 1. CHILD Cell Transfer Function.

The analog outputs from the cells in each row are added together and compared with a
fixed threshold on the right-hand side of the array. The outputs of the threshold devices
are a binary indication that the input stimulus has satisfied the requirements of enough
cells in a row to cause the threshold for that row to be exceeded. The adaptive procedure
consists of adjusting the $\theta_{ij}$'s and $w_{ij}$'s in the machine to cause the outputs of the
threshold devices to correctly classify the input stimuli. The adaptation logic has been
derived, and will be carefully described in the next section concerning the theoretical
basis for CHILD, as compared to other machines.
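
A minimal sketch of one row of the array, assuming the row is held as parallel lists of lower values, upper values, and weights. The function name `row_fires` and all numeric values are hypothetical and chosen only for illustration.

```python
def row_fires(stimulus, lowers, uppers, weights, threshold):
    """Sum the outputs of the cells in one row and compare with a fixed threshold.

    Each cell contributes its weight only when its input component lies
    strictly between its lower and upper stored values (an assumption
    about the behavior exactly at the edges).
    Returns (binary row output, analog sum).
    """
    total = sum(
        w
        for s, lo, hi, w in zip(stimulus, lowers, uppers, weights)
        if lo < s < hi
    )
    return total > threshold, total


# Hypothetical three-cell row: the first two cells accept the stimulus.
fired, total = row_fires(
    stimulus=[0.5, 0.1, 0.7],
    lowers=[0.4, 0.0, 0.0],
    uppers=[0.6, 0.2, 0.5],
    weights=[1.0, 0.5, 2.0],
    threshold=1.2,
)
print(fired, total)
```

A full machine would hold one such row per class and evaluate every row on the same stimulus.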
Figure 2. CHILD Functional Diagram.


As  was  mentioned  previously,  the  common  function  of  learning  machines  is  that  of 
remapping  the  sensory  space  into  a  new  space  where  the  decision  elements  perform  the 
required classification. A general learning machine might then be represented functionally
as shown in Figure 3. In order to compare and contrast CHILD with other learning
devices  we  shall  analyze  a  typical  (non-CHILD)  mapping  element  of  the  remapping  layer 
(Figure  4)  and  explain  how  the  remapping  function  is  accomplished. 


[Figure 3 block labels: Sensory Space, Remapping Layer, Decision Layer]

Figure 3. General Learning Machine.
$\mathbf{S}$ = input vector = $(s_1, s_2, s_3, \ldots, s_n)$

$\mathbf{W}$ = weighting vector = $(w_1, w_2, w_3, \ldots, w_n)$

$S_A = \mathbf{S} \cdot \mathbf{W} = s_1 w_1 + s_2 w_2 + s_3 w_3 + \cdots + s_n w_n$

$S_B = 0$ if $S_A < 0$

$S_B = 1$ if $S_A > 0$

Equation of Hyperplane: $s_1 w_1 + s_2 w_2 + s_3 w_3 + \cdots + s_n w_n = 0$


Figure  4.  Typical  Remapping  Element. 


Typically  an  input  vector  defined  in  an  n-dimensional  sensory  space  is  operated  on 
in  each  remapping  element  by  a  weighting  vector.  The  components  of  the  weighting 
vector  are  defined  by  stored  variable  weights  so  that  the  orientation  of  the  weighting 
vector  in  the  sensory  space  can  be  altered  by  changing  the  values  of  the  stored  weights. 
The operation performed on the input vector is merely the scalar product of the input
vector with the weighting vector. Thus the signal at point A is

$S_A = s_1 w_1 + s_2 w_2 + \cdots + s_n w_n.$

The signal $S_A$ is then passed through a threshold set equal to 0. If $S_A$ is equated to 0,

$S_A = 0 = s_1 w_1 + s_2 w_2 + \cdots + s_n w_n,$

we obtain the equation of a hyperplane in the n-dimensional sensory space. The output
of the threshold element ($S_B$) is equal to one if $S_A > 0$ and equal to zero if $S_A < 0$. Thus
the remapping element places a hyperplane in the sensory space and maps every input
on one side of the hyperplane into $S_B = 1$ and all inputs on the other side into $S_B = 0$.
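
For comparison with the CHILD cell, the typical (non-CHILD) remapping element can be sketched as a scalar product followed by a zero threshold. The function name `remapping_element` is hypothetical, and returning 0 when the scalar product is exactly zero is an assumption, since the text leaves that case unspecified.

```python
def remapping_element(s, w):
    """Typical remapping element: scalar product, then a threshold at 0.

    s and w are equal-length sequences (input and weighting vectors).
    Returns 1 on the positive side of the hyperplane s . w = 0, else 0.
    The case s . w == 0 is mapped to 0 here by assumption.
    """
    s_a = sum(si * wi for si, wi in zip(s, w))
    return 1 if s_a > 0 else 0


# Hypothetical weighting vector defining the hyperplane s1 - s2 = 0.
w = [1.0, -1.0]
print(remapping_element([2.0, 1.0], w))  # point with s1 > s2
print(remapping_element([1.0, 2.0], w))  # point with s1 < s2
```

Changing the stored weights in `w` reorients the hyperplane, which is the only adaptation available to an element of this type.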

Since  there  may  be  many  remapping  elements  in  parallel,  making  up  the  remapping 
layer,  there  are  the  same  number  of  weighting  vectors,  each  defining  a  hyperplane  in  the 
sensory  space.  Thus  we  see  how  the  sensory  space  is  partitioned  and  mapped  into  a  new 
space.  A  similar  weighting  vector  is  defined  (this  vector  may  be  fixed  or  adaptable)  in 
the  new  space  where  a  decision  is  made.  The  learning  logic  then  controls  the  adaptable 
weights  which  in  turn  locate  and  orient  the  hyperplanes  in  the  sensory  space. 

The input parameters sensed by the typical CHILD mapping element (Figure 5) are
passed through two-sided adaptable thresholds designated by $\theta_{ij}$. The output of each
$\theta_{ij}$ is "0" if $s_j < \theta_{ijl}$ or $s_j > \theta_{iju}$, and "1" if $\theta_{ijl} < s_j < \theta_{iju}$. This output is
then weighted by $w_{ij}$ and summed with similar signals of the row i. The output of the
summer is

$\sum_{j=1}^{n} w_{ij}\,\theta_{ij},$

and when this sum exceeds the fixed threshold $T_i$, the input is said to be in class i. The
mapping element in CHILD, then, consists of all the cells in a row.

Figure 5. Typical CHILD Row i.


Now let us analyze the mapping function of CHILD. A $\theta_{ij}$ threshold defines two
parallel hyperplanes in the sensory space which are perpendicular to the input axis $s_j$.
If the end of the input vector (S) lies within the region between the parallel planes, the
$s_j$ component of S initiates a vote of magnitude $w_{ij}$ for the class i. Since each cell has
a $\theta$ which is independent of every other cell's $\theta$, the sensory space is partitioned in a
controllable manner by sets of parallel hyperplanes. Thus, as seen in Figure 6, the cells
of each row partition the sensory space into regions, and map these regions onto a real
line. Referring to Figure 6, region A is mapped onto the real line at point ($w_{i1} + w_{i2}$),
region B into point $w_{i2}$ and region C into point $w_{i1}$. The fixed threshold $T_i$ then makes
the decision as to whether S is in class i. The weights in the regions defined by the
sets of parallel planes and the positions of the planes are controlled by the learning
logic, a description of which follows.
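
The mapping of regions A, B, and C onto the real line can be illustrated for a two-cell row in a two-dimensional sensory space. The window positions, weights, and threshold below are hypothetical values chosen for the example, and `row_sum` is an illustrative helper name.

```python
def row_sum(stimulus, cells):
    """Map a stimulus to a point on the real line for one row.

    cells is a list of (lower, upper, weight) triples, one per input axis.
    A cell votes with its weight when its component lies strictly inside
    its window (edge behavior is an assumption).
    """
    return sum(w for s, (lo, hi, w) in zip(stimulus, cells) if lo < s < hi)


# Hypothetical row i: cell C_i1 on axis s1, cell C_i2 on axis s2.
cells = [(0.4, 0.6, 1.0), (0.2, 0.5, 2.0)]

region_a = row_sum((0.5, 0.3), cells)  # inside both windows: w_i1 + w_i2
region_b = row_sum((0.9, 0.3), cells)  # inside only C_i2's window: w_i2
region_c = row_sum((0.5, 0.9), cells)  # inside only C_i1's window: w_i1

threshold = 2.5  # hypothetical fixed threshold T_i
print(region_a > threshold, region_b > threshold, region_c > threshold)
```

With these values only the primary region A exceeds the threshold. Raising $w_{i2}$ above 2.5 would also admit region B, which is the mechanism the learning logic uses to enlarge an acceptance region.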

CHILD  has  a  binary  output  for  each  class  of  inputs.  Therefore,  in  a  learned  state, 
an input stimulus will cause one and only one output to occur. CHILD's goal is to
achieve  this  condition. 

Figure 6. Mapping in Row i Performed by Cells $C_{i1}$ and $C_{i2}$ in Two-dimensional Sensory Space.

The teaching procedure is as follows: a stimulus of class i is presented to the
machine and each cell of row i is given a command to place the stimulus between its
respective parallel hyperplanes. Each set of parallel hyperplanes is separated by some
arbitrarily small distance $\epsilon$. When this condition is met, a region such as region A in
Figure 6 is defined in the sensory space. (Region A is an acceptance region in the
sensory space which is common to every cell in the row. We shall refer to such common
regions as primary regions throughout the rest of the paper. Regions such as B or C
shall be referred to as secondary regions, i.e., regions which are common to all cells
but one in a row. In higher-dimensional sensory spaces, ternary regions are defined as
regions which are common to all but two cells in a row, etc.) Now in order to cause
threshold $T_i$ to be exceeded, all the weights of row i are increased until the threshold is
crossed, indicating that the present stimulus is a member of class i. This abstraction
procedure can then be followed for samples of stimuli belonging to classes not yet
taught. At this point CHILD has no generalization capability since the acceptance
regions for each class are minimum-size primary regions scattered throughout the sensory
space.
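
The abstraction procedure can be sketched as follows. This is a simplified model under stated assumptions: the windows are centered on the stimulus, and the weight increment `step` is a hypothetical parameter, since the text does not say how fast the weights are raised. The name `abstract_class` is also illustrative.

```python
def abstract_class(stimulus, weights, threshold, eps=1e-3, step=0.25):
    """Teach one sample of a new class to a row (illustrative sketch).

    1. Each cell brackets its input component with a window of width eps.
    2. All weights in the row are raised together until their sum
       exceeds the fixed row threshold.
    eps, step, and the centering of the window are assumptions.
    """
    lowers = [s - eps / 2 for s in stimulus]
    uppers = [s + eps / 2 for s in stimulus]
    weights = list(weights)
    # Every cell now has an output, so the row sum is the sum of all weights.
    while sum(weights) <= threshold:
        weights = [w + step for w in weights]
    return lowers, uppers, weights


stimulus = [0.3, 0.7]
lowers, uppers, weights = abstract_class(stimulus, [0.0, 0.0], threshold=1.0)
print(lowers, uppers, weights)
```

After this step the row responds only to stimuli within eps of the taught sample on every axis, which is the rote memory the text describes.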

Generalization  for  a  class  is  achieved  by  increasing  the  size  of  the  acceptance 
regions.  This  may  be  accomplished  in  two  ways:  (1)  by  expanding  the  primary  region, 
or  (2)  by  increasing  the  weights  in  secondary  (or  ternary,  etc.)  regions  so  that  stimuli 
in  these  regions  cause  their  respective  thresholds  to  be  exceeded.  Assume  CHILD  is 
now  shown  an  additional  stimulus  in  (previously  taught)  class  k.  It  is  probable  that 
CHILD  will  not  respond  correctly  since  the  machine  has  essentially  only  a  rote  memory. 
Therefore  the  instruct  switch  would  now  be  set  which  would  activate  the  following  logic: 
If row k does not have an output (which would probably be the case), and cell $C_{km}$ does
not have an output (i.e., the stimulus does not fall between $\theta_{kml}$ and $\theta_{kmu}$), then
either $\theta_{kml}$ or $\theta_{kmu}$ is moved toward the stimulus $s_m$, depending on which is closer to
$s_m$, until cell $C_{km}$ has an output. A second logic rule is activated at the same time
which states that if row k does not have an output and cell $C_{kn}$ does have an output
(equal to $w_{kn}$) then $w_{kn}$ is increased.




These  two  operations  are  continued  until  row  k  has  an  output,  indicating  that  the 
present  stimulus  is  of  class  k.  This  logic  will  correct  the  error  occurring  when  CHILD 
fails  to  classify  a  stimulus  of  class  k  into  class  k. 
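
A rough sketch of this correction for a missed class is given below. The step sizes `theta_step` and `w_step`, the iteration cap, and the name `correct_missed_class` are all hypothetical. For a window with lower value below upper value, the stored value nearer to an outside stimulus is the one on the stimulus's side, so the code moves that one outward.

```python
def correct_missed_class(stimulus, lowers, uppers, weights, threshold,
                         theta_step=0.05, w_step=0.1, max_iters=10_000):
    """Apply the two rules until row k has an output (illustrative sketch).

    Cells without an output move their nearer stored value toward the
    stimulus; cells with an output have their weight increased.
    Step sizes and the iteration cap are assumptions.
    """
    lowers, uppers, weights = list(lowers), list(uppers), list(weights)
    for _ in range(max_iters):
        firing = [lo < s < hi for s, lo, hi in zip(stimulus, lowers, uppers)]
        if sum(w for w, f in zip(weights, firing) if f) > threshold:
            break  # row k now has an output
        for j, s in enumerate(stimulus):
            if firing[j]:
                weights[j] += w_step        # second rule: raise the weight
            elif s <= lowers[j]:
                lowers[j] -= theta_step     # lower value is nearer: move it down
            else:
                uppers[j] += theta_step     # upper value is nearer: move it up
    return lowers, uppers, weights


stim = [0.5, 0.9]
lo, hi, w = correct_missed_class(stim, [0.4, 0.2], [0.6, 0.3], [1.0, 1.0], 1.5)
print(lo, hi, w)
```

In this example the row may begin to respond through an increased weight in a secondary region before the second cell's window reaches the stimulus, which corresponds to the second of the two generalization routes named in the text.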

The  other  type  of  error  which  can  possibly  occur  is  when  a  stimulus  of  class  k  is 
classified as being a member of class l. In this case the logic will decrease all $w_{lj}$'s
which are contributing to the erroneous response. This means that all $w_{lj}$'s of the cells
that have an output in the row l will be decreased until row l no longer has an output.
Through  the  use  of  this  generalization  logic  (to  correct  both  types  of  errors),  the  features 
common  to  two  or  more  classes  are  therefore  weighted  lower  than  are  the  unique  features 
of  each  class.  Now  CHILD  has  a  generalization  ability  since  the  acceptance  regions 
have  been  expanded  in  a  controlled  manner. 
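
The correction for an erroneous response can be sketched in the same style. The decrement `w_step` and the name `correct_false_response` are hypothetical, and the sketch allows weights to become negative, which the text neither permits nor forbids.

```python
def correct_false_response(stimulus, lowers, uppers, weights, threshold,
                           w_step=0.1):
    """Lower the weights of the responding cells until the row is silent.

    Only cells that have an output for this stimulus are changed; the
    stored window values are left alone. w_step is an assumption.
    """
    weights = list(weights)
    firing = [lo < s < hi for s, lo, hi in zip(stimulus, lowers, uppers)]
    while any(firing) and sum(w for w, f in zip(weights, firing) if f) > threshold:
        weights = [w - w_step if f else w for w, f in zip(weights, firing)]
    return weights


# Hypothetical row that wrongly responds: only the first cell has an output.
new_weights = correct_false_response(
    [0.5, 0.9], [0.4, 0.2], [0.6, 0.3], [2.0, 1.0], threshold=1.5
)
print(new_weights)
```

Because only the responding cells are penalized, a feature shared with another class loses weight in this row while features unique to the class keep theirs.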

It might be advisable at this point to summarize the instruction logic which CHILD
employs.

Let us define $R_k$ = row k has an output
              $S_k$ = row k should have an output
              $C_{ki}$ = cell ki has an output,

where an overbar denotes that the condition does not hold.

If CHILD makes an error after the abstraction procedure, the following logic is implemented
by setting the instruct switch:

$R_k \cdot \bar{S}_k \cdot C_{ki}$ - decrease $w_{ki}$

$\bar{R}_k \cdot S_k \cdot \bar{C}_{ki}$ - move $\theta_{kiu}$ or $\theta_{kil}$ toward the input depending upon which
one is closer to $s_i$

$\bar{R}_k \cdot S_k \cdot C_{ki}$ - increase $w_{ki}$

For all other possibilities there is no change in the adaptable parameters.
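
The summary can be written as a small decision function over the three conditions. The function name `instruct_action` and the returned action strings are illustrative labels, not terms from the report.

```python
def instruct_action(row_has_output, row_should_have_output, cell_has_output):
    """Return the adaptation applied to one cell under the instruct switch.

    Arguments correspond to R_k, S_k, and C_ki. The returned strings are
    hypothetical labels for the three adjustments named in the summary.
    """
    r, s, c = row_has_output, row_should_have_output, cell_has_output
    if r and not s and c:
        return "decrease weight"
    if not r and s and not c:
        return "move nearer threshold toward input"
    if not r and s and c:
        return "increase weight"
    return "no change"


# Enumerate all eight combinations of the three conditions.
for r in (False, True):
    for s in (False, True):
        for c in (False, True):
            print(r, s, c, "->", instruct_action(r, s, c))
```

Five of the eight combinations leave the parameters untouched, so adaptation takes place only when a row is in error.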

The  efficiency  of  CHILD  is  obviously  dependent  to  a  great  extent  upon  the  nature  of 
the  analog  sensory  space.  It  can  be  seen  from  the  previous  discussion  that  CHILD  is 
capable  of  organizing  itself  such  that  only  the  important  characteristics  for  each  class 
are relied upon to make the discriminations required. This is done in an independent
fashion  such  that  different  characteristics  can  be  found  important  for  different  classes. 

In addition, CHILD makes the determination as to what specific range of values is
acceptable for each characteristic. It is therefore not required that the analog inputs be
absolute invariances (although this is obviously desirable!), and in many instances,
simply scaling and/or normalization of numbers obtained from the real world are
acceptable.

Pattern  recognition  problems  of  more  complex  nature  require  that  more  thought  and 
ingenuity  be  devoted  to  the  "front  end"  design,  but  it  relieves  the  burden  a  great  deal  to 
keep  in  mind  that  CHILD  will  decide  which  values  of  which  characteristics  are  important 
(as  well  as  their  degree  of  importance)  to  make  the  proper  classification  of  the  input 
stimuli. 